<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 11">
<meta name=Originator content="Microsoft Word 11">
<link rel=File-List href="overview_files/filelist.xml">
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author>Randy Cox</o:Author>
  <o:LastAuthor>Randy Cox</o:LastAuthor>
  <o:Revision>2</o:Revision>
  <o:TotalTime>11</o:TotalTime>
  <o:Created>2007-09-24T02:23:00Z</o:Created>
  <o:LastSaved>2007-09-24T02:34:00Z</o:LastSaved>
  <o:Pages>1</o:Pages>
  <o:Words>222</o:Words>
  <o:Characters>1272</o:Characters>
  <o:Company>Pukoa Scientific</o:Company>
  <o:Lines>10</o:Lines>
  <o:Paragraphs>2</o:Paragraphs>
  <o:CharactersWithSpaces>1492</o:CharactersWithSpaces>
  <o:Version>11.6360</o:Version>
 </o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:SpellingState>Clean</w:SpellingState>
  <w:GrammarState>Clean</w:GrammarState>
  <w:ValidateAgainstSchemas/>
  <w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
  <w:IgnoreMixedContent>false</w:IgnoreMixedContent>
  <w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
  <w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
 </w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:LatentStyles DefLockedState="false" LatentStyleCount="156">
 </w:LatentStyles>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{mso-style-parent:"";
	margin:0in;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:12.0pt;
	font-family:"Times New Roman";
	mso-fareast-font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;
	text-underline:single;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;
	text-underline:single;}
p
	{font-size:12.0pt;
	font-family:"Times New Roman";
	mso-fareast-font-family:"Times New Roman";}
pre
	{margin-top:0in;
	margin-bottom:0in;
	margin-bottom:.0001pt;
	font-size:10.0pt;
	font-family:"Courier New";
	mso-fareast-font-family:"Times New Roman";}
span.SpellE
	{mso-style-name:"";
	mso-spl-e:yes;}
span.GramE
	{mso-style-name:"";
	mso-gram-e:yes;}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;
	mso-header-margin:.5in;
	mso-footer-margin:.5in;
	mso-paper-source:0;}
div.Section1
	{page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
 /* Style Definitions */
 table.MsoNormalTable
	{mso-style-name:"Table Normal";
	mso-tstyle-rowband-size:0;
	mso-tstyle-colband-size:0;
	mso-style-noshow:yes;
	mso-style-parent:"";
	mso-padding-alt:0in 5.4pt 0in 5.4pt;
	mso-para-margin:0in;
	mso-para-margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:10.0pt;
	font-family:"Times New Roman";
	mso-ansi-language:#0400;
	mso-fareast-language:#0400;
	mso-bidi-language:#0400;}
</style>
<![endif]--><!--[if gte mso 9]><xml>
 <o:shapedefaults v:ext="edit" spidmax="2050"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <o:shapelayout v:ext="edit">
  <o:idmap v:ext="edit" data="1"/>
 </o:shapelayout></xml><![endif]-->
</head>

<body bgcolor=white lang=EN-US link=blue vlink=purple style='tab-interval:.5in'>

<div class=Section1><pre><span class=SpellE>WebSpider-cox</span></pre><pre><o:p>&nbsp;</o:p></pre><pre>The command line syntax is:  $ java -jar <span
class=SpellE>WebSpider.jar</span> URL <span class=SpellE>PagesToCheck</span> [-<span
class=SpellE>totallinks</span>] [-<span class=SpellE>mostpopular</span>] [-logging] [-help] [-<span class=SpellE>testparse</span>]</pre><pre><o:p>&nbsp;</o:p></pre><pre>URL<span
style='mso-tab-count:2'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>the website address at which to start the search (e.g. <a
href="http://www.test.com/">http://www.test.com</a>)</pre><pre><span
class=SpellE>PagesToCheck</span> <span style='mso-tab-count:1'>&nbsp; </span>the number of pages to traverse before ending</pre><pre>-<span
class=SpellE>totallinks</span><span style='mso-tab-count:1'>&nbsp;&nbsp;&nbsp; </span>count the number of links encountered</pre><pre>-<span
class=SpellE>mostpopular</span><span style='mso-tab-count:1'>&nbsp;&nbsp; </span>find the most popular page, that is, the page with the most links pointing to it</pre><pre>-help<span
style='mso-tab-count:2'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>print the command line syntax</pre><pre>-logging<span
style='mso-tab-count:1'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>print events as the program executes</pre><pre>-<span
class=SpellE>testparse</span><span style='mso-tab-count:1'>&nbsp;&nbsp;&nbsp;&nbsp; </span>print the program's flags and variables after the command line is parsed (for testing)</pre><pre><o:p>&nbsp;</o:p></pre><pre>The page at the URL passed in is retrieved first. Any links on that page are saved and retrieved as well. This continues until <span
class=SpellE>PagesToCheck</span> pages have been retrieved. The total number of links and the most popular link are recorded along the way. These values are printed if the -<span
class=SpellE>totallinks</span> and/or -<span class=SpellE>mostpopular</span> options are used. Logging information is printed as the application executes if -logging is used.</pre><pre><o:p>&nbsp;</o:p></pre><pre>Testing</pre><pre>To unit test the application, a controlled test website was constructed so that the tool produces exact, known results. The test website is located at:</pre><pre><a
href="http://www2.hawaii.edu/~randycox/11.WebSpider/test/main.htm">http://www2.hawaii.edu/~randycox/11.WebSpider/test/main.htm</a></pre><pre>Although this makes successful unit testing depend on an external site, the advantage of having a real website with custom-built test scenarios outweighed the risk.</pre></div>
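<pre><o:p>&nbsp;</o:p></pre><pre>As an illustration of the command line syntax described above, the following is a minimal sketch of how the arguments might be parsed. The class name SpiderArgs, its field names, and its error messages are assumptions made for this example; they are not taken from the WebSpider source.</pre>
<pre>
```java
// Hypothetical sketch: SpiderArgs is an illustrative name, not part of WebSpider.
class SpiderArgs {
    String url;            // first positional argument
    int pagesToCheck;      // second positional argument
    boolean totalLinks, mostPopular, logging, help, testParse;

    // Parses: URL PagesToCheck [-totallinks] [-mostpopular] [-logging] [-help] [-testparse]
    static SpiderArgs parse(String[] args) {
        SpiderArgs parsed = new SpiderArgs();
        int positional = 0; // how many positional arguments have been seen so far
        for (String arg : args) {
            if (arg.startsWith("-")) {
                switch (arg) {
                    case "-totallinks":  parsed.totalLinks = true;  break;
                    case "-mostpopular": parsed.mostPopular = true; break;
                    case "-logging":     parsed.logging = true;     break;
                    case "-help":        parsed.help = true;        break;
                    case "-testparse":   parsed.testParse = true;   break;
                    default:
                        throw new IllegalArgumentException("Unknown option: " + arg);
                }
            } else if (positional == 0) {
                parsed.url = arg;
                positional++;
            } else if (positional == 1) {
                // Throws NumberFormatException if the value is not an integer.
                parsed.pagesToCheck = Integer.parseInt(arg);
                positional++;
            } else {
                throw new IllegalArgumentException("Unexpected argument: " + arg);
            }
        }
        // -help may be given on its own; otherwise both positional arguments are required.
        if (!parsed.help && positional != 2) {
            throw new IllegalArgumentException("URL and PagesToCheck are required");
        }
        return parsed;
    }

    public static void main(String[] args) {
        SpiderArgs a = parse(new String[] {"http://www.test.com", "10", "-totallinks", "-logging"});
        System.out.println(a.url + " " + a.pagesToCheck
                + " totallinks=" + a.totalLinks + " logging=" + a.logging);
    }
}
```
</pre>
<pre>In this sketch the options may appear in any order after or between the two positional arguments; whether WebSpider itself accepts them that way is not stated on this page.</pre>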

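<pre><o:p>&nbsp;</o:p></pre><pre>The retrieval process described above (start at one page, save its links, retrieve them in turn, stop at the page limit) can be sketched without any network access. The example below is only an illustration under stated assumptions: the class name CrawlSketch is hypothetical, pages are modelled as indices into an in-memory array instead of real URLs, and the tie-breaking rule for the most popular page is a choice made for this sketch, not documented WebSpider behaviour.</pre>
<pre>
```java
// Hypothetical sketch: CrawlSketch and its in-memory "site" are assumptions,
// not the WebSpider implementation. No pages are fetched over the network.
class CrawlSketch {
    int pagesVisited;      // pages actually retrieved
    int totalLinks;        // links seen on retrieved pages (what -totallinks would report)
    int mostPopular = -1;  // page with the most inbound links seen, or -1 if no links were seen

    // links[p] lists the indices of the pages that page p links to.
    static CrawlSketch crawl(int[][] links, int start, int pagesToCheck) {
        CrawlSketch result = new CrawlSketch();
        int n = links.length;
        int[] inbound = new int[n];         // number of links pointing at each page
        boolean[] queued = new boolean[n];  // pages already saved for retrieval
        int[] queue = new int[n];           // FIFO of pages waiting to be retrieved
        int head = 0, tail = 0;

        queue[tail++] = start;
        queued[start] = true;

        // Breadth-first traversal, stopping once the page limit is reached.
        while (head < tail && result.pagesVisited < pagesToCheck) {
            int page = queue[head++];       // "retrieve" the next saved page
            result.pagesVisited++;
            for (int target : links[page]) {
                result.totalLinks++;
                inbound[target]++;
                if (!queued[target]) {      // save each new link exactly once
                    queued[target] = true;
                    queue[tail++] = target;
                }
            }
        }

        // Pick the page with the most inbound links; on a tie the lowest index wins.
        for (int p = 0; p < n; p++) {
            if (inbound[p] > 0
                    && (result.mostPopular == -1 || inbound[p] > inbound[result.mostPopular])) {
                result.mostPopular = p;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Page 0 links to 1 and 2; page 1 links to 2; page 2 links to 0 and 3; page 3 has no links.
        int[][] site = { {1, 2}, {2}, {0, 3}, {} };
        CrawlSketch r = crawl(site, 0, 3);
        System.out.println(r.pagesVisited + " pages, " + r.totalLinks
                + " links, most popular page " + r.mostPopular);
    }
}
```
</pre>
<pre>With the four-page site in main and a limit of 3, pages 0, 1 and 2 are retrieved, 5 links are counted, and page 2 is the most popular because two links point to it. A small fixed site like this plays the same role as the controlled test website: the expected counts are known in advance.</pre>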
</body>

</html>
